ABSTRACT


Mental measurement in the Armed Forces is an absolute necessity.
Two interrelated existing problems must be resolved before mental
measurement can be used most effectively. First, a determination of
which jobs or occupations need to be filled in the Armed Forces is
necessary, and, additionally, performance must be measured. The
second problem is testing or mental measurement. Test development
must be centered around a job or occupation or a series of jobs or
occupations and correlated against job or occupation performance.

A  partial  review  of  the  literature  on  military  testing  indicates 
that  testing  is  being  conducted  without  resolving  the  first  problem 
in  any  sound  testing  program.   Additionally,  there  are  indications 
that  correlation  studies  of  currently  used  tests  are  frequently  not 
conducted  in  an  unbiased  scientific  manner.   High  correlation  coefficients, 
no  matter  how  they  are  obtained,  have  possibly  become  the  ultimate 
goal  of  military  testing. 
CHAPTER I
HISTORICAL  BACKGROUND 

Military  commanders  have  for  many  centuries  judged  the 
mental  capacity  of  their  subordinates.  This  was  and  is 
today one of the principal elements military commanders must
consider when assigning personnel. Selection and assignment
of personnel during the nineteenth century, a period of
relatively small regional wars, was based upon political
considerations, judgement of mental capacities, technical
know-how, and necessity, in that order.

Twentieth century global warfare required many more men
on the battlefield, each assigned so that his contribution
to the war effort would be maximized. Military commanders
found it impossible to evaluate millions of men on a personal
judgement basis when the United States entered World War I.
The American Psychological Association appointed a committee
to study this problem. A method of determining the
intellectual level of millions of men was desired. This was
considered necessary for the rapid classification, training,
and assignment of men to different types of service in order
to save time and to properly utilize available human
resources.

Prior to World War I, psychological testing had not, in
general, attempted to measure individual differences for the
purpose of personnel placement. Mental measurement tests
had been developed for use in clinical examination of
psychiatric patients. In addition, Cattell, in 1890,
described a series of tests which were being used to
determine the intellectual level of college students,
although his idea that "a measure of intellectual functions
could be obtained through tests of sensory discrimination and
reaction time" appears to be somewhat in error.

A.   BASIC  MILITARY  NEEDS 

A group test was desired to handle the classification
problems of World War I. Army psychologists drew upon all
available test material, much of which was unproven as to
its usefulness, and developed the Army Alpha and Beta tests.
These tests were well suited for group use, where only a
very general classification was required. The Alpha test
was designed for literate groups, while the Beta was for use
with those who were not literate in the English language.
The Beta test proved to be less valid than the Alpha, but it
was sufficiently discriminating for emergency use. An
additional group test, the "Personal Data Sheet," was used in
World War I to screen out those individuals with
psychological difficulties. Psychologists also developed
tests of special or specific abilities that proved to be
moderately useful. Army psychologists, in collaboration with
their civilian contemporaries, were able to develop a group
intelligence test that contributed immeasurably to the
solution of the emergency classification problems and the
ultimate success of the war effort.

1Anne Anastasi, Psychological Testing, The Macmillan
Company, New York, 1955, p.9V

Group  tests  developed  by  the  Army  in  World  War  I  were 
eagerly  accepted  for  civilian  use  after  the  war.  Group  tests 
became the panacea in personnel selection and placement.
This movement encompassed peoples of all ages and groups.
Studies of special groups were undertaken for various reasons.
The use of group tests became indiscriminate, and when the
results  failed  to  meet  expectations  much  hostility  and 
skepticism  developed.  Much  of  this  hostility  and  skepticism 
is  still  present,  one  and  a  half  generations  later,  for 
reasons  which  were  and  may  still  be  well  founded. 

B.   ADVANCES  IN  MILITARY  TESTING 

During the interim between World Wars I and II, following
the advent of group testing, many advances were made in the use
of mental measurement tests. The Navy's Bureau of Navigation
organized a personnel testing program as a part of its
training division in 1924. A General Classification Test
was used at training stations to select enlisted men for
Navy schools. Later, this same test was used as a screening
device at recruiting stations. Other tests were also
introduced, and by December 1941 the following tests were in
general use at recruit training stations: "General
Classification, Mechanical Aptitude Test, Arithmetic Test,
English Test, Spelling Test, and Radio Aptitude Test."2

These  tests  had  served  well  during  peacetime  when  the 
ratio  of  selection  to  applicants  for  Naval  service  was 
rather  low.  However,  when  this  ratio  was  raised  as  mass 
mobilization  became  a  necessity,  the  tests  were  found  to  be 
grossly  inadequate  for  selection  purposes.  There  was  little 
differentiation  between  good  men  and  their  capabilities  in 
various  rates.  Training  schools  found  that  many  men  enrolled 
had little capability in their assigned specialty. Local
testing  programs  developed  at  many  stations  in  an  attempt  to 
overcome  these  difficulties. 

By May 1942, the enormity of the personnel testing
program and its concomitant problems was recognized. A
request for assistance was made to the Office of Scientific
Research and Development. As a result of this and other
developments, a two-pronged attack was launched to resolve
the problems related to testing as soon as possible.


2Personnel Research and Test Development in the Bureau
of Naval Personnel, ed. Dewey B. Stuit, Princeton University
Press, 1947, p. 6.


The problems were essentially divided into two parts.
First, since the present tests revealed a lack of validity,
tests had to be developed which were valid. Second, little
was known about the requirements of Navy training on a mass
basis. Knowledge concerning the second aspect of this
problem was necessary before the first part could be resolved.
This emergency was met in much the same fashion as the
identical problem of selection, classification, and placement
was met in World War I. Both the Army and Navy faced these
problems and, as before, psychologists and personnel
officials of both services pooled their resources and procured
civilian assistance and civilian tests. Tests began to
improve in validity in late 1942 and continued to do so
for the remainder of the war.

Following World War II, in 1946, a permanent research
organization was approved to:

Undertake a coordinated program of personnel research
and test development centered around the major
personnel problems of the Navy...to conduct studies
on personnel, policy, techniques, and procedures, and
on the assignment, evaluation, promotion or advancement,
and morale of officer and enlisted personnel; ...and to
develop such psychological and educational tests and
other instruments as may be necessary for the selection,
classification, training, and evaluation of performance
of Navy personnel. (3)


3Future wars may not require such mass mobilization, and
in any event will surely not allow sufficient time for the
construction of tests which are valid enough to be used as a
reliable guide for the selection, placement, and training of
personnel.

4Stuit, op. cit., p. 11.
This organization has had various titles over the last
eighteen years and has made many recommendations for
improving the Navy's personnel administration. Funding
limitations, as well as opposition to change, have prevented full
implementation of the program and its recommendations.

C.   SELECTED  NAVY  TESTS 

As a result of the groundwork laid during World War II
and the permanent organization established as the Personnel
Research Division of the Bureau of Naval Personnel shortly
after the war, many methods of mental measurement have been
devised. This discussion will be limited to those tests
considered basic, with mention of other special Navy tests.

Enlisted Basic Test Battery

The General Classification Test (GCT)

...is a 100-item test designed to measure the ability
to comprehend material of a verbal nature. The 40
sentence-completion items and 60 verbal-analogy items
which comprise the test are arranged in order of
increasing difficulty. The testee is to select the
one most correct answer from the five possible answers
which are given. A time limit of 35 minutes is used.
(5)

5Development and Standardization of the U. S. Navy Basic
Test Battery, Form 6, Bureau of Naval Personnel Research
Report 58-2, U. S. Naval Personnel Research Field Activity,
San Diego, Calif., Nov. 1958, p. 1.

This test, like all Navy tests designed for mental
measurement, is standardized and differences can be readily
established. Appendix A illustrates the comparison of the
Navy's standard T-scores with Z-scores, Stanines, and IQ
scores. All of these different methods of scoring are based
on a normal probability curve.
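
Because all of these scales are linear (or, for stanines, banded) transformations of the same normal deviate, one scale can be converted to another. The following is a minimal sketch of that relationship, not the Navy's own conversion table. It assumes the conventional definitions: a T-score with mean 50 and standard deviation 10, a deviation IQ with mean 100 and standard deviation 15 (some tests use 16), and stanines as half-standard-deviation bands centered on 5. The function name convert_t_score is hypothetical.

```python
import math


def convert_t_score(t_score):
    """Convert a T-score to a Z-score, stanine, and deviation IQ.

    Assumed conventions (not taken from the Navy report):
      T  = 50 + 10 * z
      IQ = 100 + 15 * z
      stanine = half-SD bands centered on 5, clipped to 1..9
    """
    z = (t_score - 50) / 10
    # floor(2z + 5.5) assigns z in [-0.25, 0.25) to stanine 5, and so on.
    stanine = min(9, max(1, math.floor(2 * z + 5.5)))
    deviation_iq = 100 + 15 * z
    return {"z": z, "stanine": stanine, "iq": deviation_iq}


if __name__ == "__main__":
    for t in (30, 50, 60, 75):
        print(t, convert_t_score(t))
```

Under these assumptions a T-score one standard deviation above the mean falls in the seventh stanine; the actual equivalences used by the Navy are those given in Appendix A.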

The Arithmetic Test (ARI) is designed as

...two separately-timed subtests, a 20-item Arithmetic
Computation Subtest and a 30-item Arithmetic Reasoning
Subtest. Both kinds of items are in five alternative
multiple choice form. (6)

Time limits are also established for both of these
subtests and are 12 and 35 minutes respectively.

A Mechanical Test (MECH) is designed as

...two separately-timed 50-item subtests, Tool Knowledge
and Mechanical Comprehension. ...time limits are 10
minutes for the Tool Knowledge Subtest and 25 minutes
for the Mechanical Comprehension Subtest.

Each tool knowledge item consists of five pictures of
mechanical or electrical tools or equipment. The testee
is to select from the last four objects pictured the one
which is most closely associated with the tool or
object in the first picture. ...

Each mechanical-comprehension item consists of one or
more drawings in which a mechanical problem is presented.
The testee is to show whether he understands the
mechanical principles involved by marking one of the three
possible answers provided. (7)

The fourth and last of the Navy's Enlisted Basic Test
Battery is the Clerical (CLER), which is designed to

...measure the ability to observe quickly and accurately,
consists of 240 pairs of five-to-nine-digit numbers which
must be compared at a high rate of speed. The examinee
indicates whether the two members of the pair are the
same or are different by marking an "S" or "D" in the
adjoining answer space. (8)


6Ibid., p. 2.
7Ibid., pp. 2-3.
8Ibid., pp. 3-4.


Other  Navy  Enlisted  Tests 

In order to obtain supplementary information necessary
for proper classification of enlisted personnel, several
special tests have been devised. These tests include but
are not limited to the following: (9)

1. Electronics Technicians Selection Test
2. Radio Code Test
3. Telephone Talker Test
4. Sonar Pitch Memory Test
5. Navy Literacy Test (10)
6. Non-verbal Classification Test

One other test, perhaps the most important of all Navy
Enlisted Tests, is the Advanced Technicians Test. Because
of the increasing complexity of today's scientific and
technological requirements, more effective screening methods
are provided for advanced technical training by the use of
a new test. This test, the Advanced Technicians Test,
consists of four parts: Reading Comprehension, Mathematics,
Physics, and Electricity.

The Advanced Technicians Test does not replace the basic
test batteries. It is given to enlisted personnel in second
or subsequent enlistments, and results are recorded on page 3
of the service record and on the Bureau of Naval Personnel
Enlisted Master Tape. Tests are given at designated Enlisted
Classification Units, and retests are not authorized. Tests
are to be administered to the following categories of
personnel not previously tested: (11)

1. qualified submariners
2. less than 12 years' service
3. all others who desire to be tested
4. all first reenlistees
5. all applicants for nuclear power training

9Information and Education Manual, NavPers 16,963D,
Bureau of Naval Personnel, Aug. 1955, p. 70.

10This test is designed for those who cannot read English.

Basic Tests for Naval Officer Personnel

The need for an intellectual screening device for officer
personnel was extremely urgent owing to the necessity for a
rapid expansion of the Navy early in World War II. In an
effort to meet this urgent requirement, two groups of tests
were developed in 1942 and 1943. These tests were to be
given and used as a part of an initial screening of applicants
and as a classification tool after applicants had been
processed beyond this initial stage.

Basic among the tests developed was the Officer
Qualification Test. This test, still used with modifications,
consisted of three parts: vocabulary, mechanical comprehension,
and arithmetical reasoning. It was felt that independent
verbal, mechanical, and arithmetical abilities were indicated
by factorial analysis.

11Bureau of Naval Personnel Instruction 1236.29, 20 June


The vocabulary portion of the 100-question test
consisted of 50 opposite items where the testee was to
select a word from among five that was nearest to being the
opposite of a stimulus word. The Arithmetical portion of the
test consisted of twenty questions having five choices for
each question. The Mechanical Comprehension test completed
the Qualification Battery. This subtest consisted of thirty
items illustrating mechanical situations about which a
question was asked and an answer chosen from three alternatives.
Sixty minutes were allowed to complete the entire battery, with
recommended times for each section.

An Officer Classification Test was developed to
differentiate among officers in order that assignments could
be made to specific duties with a minimum of misplacement.
This test battery is composed of five sections as follows: (12)

I.   Verbal Reasoning Test          75 five-choice analysis items
II.  Mechanical Comprehension Test  4$ five-choice mechanical
                                    comprehension items
III. Mathematics Test               50 five-choice mathematics items
IV.  Relative Movement Test         50 four-choice relative
                                    movements items
V.   Spatial Test
     A. Block Assembly              30 four-choice block assembly items
     B. Block Rotation              30 five-choice block rotation items


12 
CHAPTER II
CURRENT  USES  OF  TESTS 

Tests  as  mental  measurement  devices  are  currently 
enjoying  widespread  use.  The  majority  of  psychologists  and 
many  personnel  administrators  feel  that  mental  testing  has 
attained its majority and is moving toward a yet to be found
maturity. Many others feel that mental testing has attained
its long sought maturity and that the proper use of mental
tests will benefit mankind immeasurably. A few professionals
in the mental testing field feel that testing is still in
its infancy and that test results furnish only a sample of
individual capacities. These few professionals question
mental testing and ask if mental tests do give a fair and
effective measure of a person's intelligence, aptitude,
knowledge, or ability to think.

A listing of tests currently in use with a brief
description of each would fill several volumes. Even a list
of companies which furnish testing services would be rather
extensive. However, the five giants of this industry are:
(1) Educational Testing Service, Princeton, New Jersey,
(2) Psychological Corporation, New York, N. Y., (3) Harcourt,
Brace, and World, Inc., (4) California Test Bureau, Los
Angeles, Calif., and (5) Science Research Associates, Inc.,
Chicago, Ill.

A.   MILITARY APPLICATION

Each of the armed services has tests which attempt to
measure intelligence, aptitude, knowledge, and ability to
think. Tests used by the different armed services are
similar, and each service uses similar procedures. However,
since each service considers itself unique and therefore
has unique requirements, each has its own tests.

These tests are of the pencil and paper type and are
considered to be an essential part of the selection and
classification procedures, as previously noted. Tests are
usually given early in an individual's military career.
Results are tabulated mechanically and posted in the member's
Service Record. Retests are allowed if it can be shown by
the testee that he was under some severe handicap at the time
of the original test.

Tabulated  test  results  are  used  to  determine  how  and  in 
what  manner  the  newly  inducted  service  members  can  make  their 
greatest  contribution  to  the  mission  of  their  particular 
service.

Uses of Navy Enlisted Tests

Test results are first used in the case of Naval
enlisted personnel at Recruit Training Commands. During
their period of recruit training, but after the basic test
battery has been scored and entered in their personnel
record, they are interviewed individually by military
personnel who are alleged to be qualified in personnel
placement work. At this interview, careful consideration is
given to the individual's personal preference, test scores,
civilian work experience, motivation, previous training, and
general interests. The recruit is then given his first
classification.13

At this time recommendations for Class A training are
made, although assignment to schools for all recruits
classified as eligible is frequently impossible due to peak
recruit inputs, service needs, and school capacities.

Also in the past, recruits who scored low on their
basic test battery were recommended for administrative
discharge at this point inasmuch as they were deemed not
capable of being trained to fill Navy billets. However, this
has been partially corrected by more adequate testing prior
to enlistment, which is designed to weed out individuals
whose ability to read and write is suspect.

The Navy's Bureau of Naval Personnel has established
minimum cutting scores on the basic test battery for many
occupations requiring formal training. These minimum scores
are widely disseminated throughout the Navy, and individuals
who have attained the required status as indicated by their
scores are eligible to apply for formal training after they
have left recruit training. Individuals who apply for
Class A formal training after recruit training are subject to
handicaps, but channels are available to overcome these
impediments if the qualified person is persistent.

13Classification interviews are occasionally carried out
in a perfunctory manner, allowing less than ten minutes per
interview.

One of the most severe handicaps is the constant shortage
of on board personnel in commands afloat and ashore. This
shortage has occasionally been justified, but more frequently
it is the result of local provincialism. Whatever the reasons,
individual requests from qualified personnel frequently never
leave their respective commands.

Another handicap frequently encountered in the field is
a basic misunderstanding as to what test scores mean. For
example, a department needs another striker, and a sailor is
selected based upon rather general criteria. One of the
most heavily weighted factors considered is his basic test
scores. If the newly acquired striker learns his new job
rapidly, he is considered to be a fine fellow and his test
scores were an excellent predictor of his success. If the
striker learns the job slowly or not at all, he is considered
a smart ne'er-do-well and is promptly labeled as such. This
label, which is informally spread throughout the command,
adheres to this person without discrimination. He is seldom
afforded another opportunity to strike for an occupation
where his talents could be utilized. In the event this second
hypothetical sailor has sufficient obligated service when a
must-quota for school is sent to the command, he will
probably be made available, but in the meantime he may have
become firmly convinced that the Navy is no place for him.

The Bureau of Naval Personnel periodically issues
Notices to field activities stating personnel requirements
and requesting applicants for various schools. Each Notice
or one of its references contains minimum test score
qualifications for applicants. Continuing requirements are
issued in the form of instructions. A limited number of
instructions and contents are outlined below:

Bureau of Naval Personnel Instruction 1510.69F

This instruction concerns the Navy Enlisted Scientific
Education Program (NESEP). The NESEP is an uninterrupted
four-year college educational program leading to a
baccalaureate degree in major fields approved by the Chief,
Bureau of Naval Personnel. Upon graduation enlisted personnel
are ordered to Newport, Rhode Island or elsewhere for Officer
Candidate School (OCS). Upon successful completion of OCS,
students are commissioned in the Regular Navy. Eligibility
requirements include a combined GCT and ARI score of 118.
This ensures the Navy that the chances of an individual
succeeding in this program are excellent. Other screening
examinations are given and all necessary precautions are
taken to ensure that applicants are properly motivated.
Similar prerequisites are set forth in the Bureau of Naval
Personnel Notice 1531 of 28 June 1963 for military academy
applicants.

Bureau of Naval Personnel Instruction 1^00 ol^O

This instruction concerns the selection and training of
candidates for diving duty. The mental requirements for
selection to this program are listed as desirable and consist
of a combined ARI and MECH score of 105.

Similar prerequisites are established for Electronics,
Clerical, and other occupations. Prerequisites are geared to
the level of skill deemed necessary to successfully function
in the particular occupation. However, Commanding Officers
may request test score waivers in meritorious cases where it
is believed that a candidate does possess the necessary
capacity for training and that this capacity is not reflected
in his test scores.

Efforts by the Chief, Bureau of Naval Personnel to
establish minimum prerequisites necessary for an individual
to attain proficiency in many areas have been extremely
successful, as can be judged by correlation studies between
test scores and grades of students completing courses of
study. However, it should be emphasized that these studies
included only those students who graduated and did not
include those who were disenrolled for various reasons or
given certificates of completion. The prerequisites were
established to reduce costs and to increase the level of
training in the Navy. It can be safely assumed that these
two objectives have been partially met.
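
The caveat about studies limited to graduates can be illustrated with a small simulation. The sketch below uses entirely synthetic data, not Navy records. It assumes test scores and course grades are jointly normal with a hypothetical true correlation of 0.6, and that students whose (hypothetical) grade falls below an assumed passing mark are disenrolled and therefore dropped from the study. The names pearson_r and simulate and all parameter values are illustrative assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, rho=0.6, passing_grade=-0.5, seed=1):
    """Compare the correlation for all entrants with graduates only.

    Scores and grades are standardized synthetic values; passing_grade
    is a hypothetical cutoff below which a student is disenrolled.
    """
    rng = random.Random(seed)
    scores, grades = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        scores.append(x)
        grades.append(y)
    graduates = [(x, y) for x, y in zip(scores, grades) if y >= passing_grade]
    r_all = pearson_r(scores, grades)
    r_graduates = pearson_r([x for x, _ in graduates],
                            [y for _, y in graduates])
    return r_all, r_graduates, len(graduates)


if __name__ == "__main__":
    r_all, r_graduates, n_graduates = simulate()
    print("all entrants:   r =", round(r_all, 3))
    print("graduates only: r =", round(r_graduates, 3), "n =", n_graduates)
```

In this synthetic case the graduates-only coefficient differs from the coefficient for all entrants. The point is only that a coefficient computed on graduates is not the same quantity as one computed on everyone who was admitted, so the two should not be interpreted interchangeably.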

Efforts have also been made by the Chief, Bureau of
Naval Personnel to enforce compliance with established
prerequisites. Bureau of Naval Personnel Instruction 15 10^7
of 20 Oct. 1952, which is still in effect, noted that
excessive numbers of ineligible candidates were being
received at enlisted service schools. This instruction
directed the attention of all commanding officers to the
problem and further directed strict compliance with current
directives. Failure to meet minimum basic battery test
scores was listed as one of the most frequent errors causing
candidates to be ineligible.

Uses of Navy Officer Tests

The tests designed for use in selecting and classifying
officers were outlined in Chapter I. These devices were
used during World War II. The Officer's Selection Battery
served to screen out personnel who were not deemed to be of
officer potential and was given to practically all officers.
The Officer's Classification Battery was not administered
to all officers, and many officers have never taken this
series of tests. A survey of ninety-four officers having
from five to eighteen years service at the U. S. Naval
Postgraduate School in 1962 revealed only twenty-one officers
who had taken these tests.14 An estimate of the number of
officers who have taken the classification battery is not
available, but these tests have been regularly administered
to all newly commissioned officers since about 1951.

A search of publications, records, and regulations at
the U. S. Naval Postgraduate School does not indicate that
the results of the Officer's Classification Battery test
scores are enjoying wide use.

The Officer's Selection Battery is enjoying wide use.
All applicants for commissioned status are given this
battery. It is given at officer procurement centers and is
given annually to inservice applicants. This test battery is
an extremely basic instrument and furnishes little information
in addition to that required for acceptance or rejection.


14D. J. Martz and T. E. Rushin, "Determination of Valid
Criteria for Selecting Postgraduate Management School Candidates
on the Basis of Established Academic Performance and Various
Aptitude Tests" (Unpublished research paper, U. S. Naval
Postgraduate School, Monterey, 1962), p. 11.
CHAPTER III
LIMITATIONS  OF  TESTING 

The limitations of testing in determining mental level,
general aptitudes, and various personality characteristics
of individuals are, in the author's estimation, presently
uncountable. All of these limitations cannot be overcome in
the foreseeable future. However, they may be reduced to a
respectable level if due recognition is given to the facts.

A limited search of the literature in the field of
testing has not revealed a common concise definition of
intelligence or intellect, although many testers and
psychologists claim that this is what they are measuring.
Then we can only conclude that intelligence is what intelligence
tests measure. If this definition of intelligence is
accepted, no satisfactory test of ability to learn will ever
be developed. Tests currently measure intelligence by the
sample technique, i.e. a performance sample is taken under
standardized conditions. This sample has actually been taken
from the achievements of the individual. It has not measured
his ability to learn, which may be far above or far below his
level of achievement as revealed by the sample.

Another  limitation  of  testing  is  the  use  of  test  results. 
An  example  previously  given  concerning  the  selection  of  a 
striker  is  one  misuse  that  could  be  easily  corrected.   If 
test results are treated as a final measure of ability or
aptitude, then the test is being misused, because tests
cannot  furnish  us  with  an  absolute  numerical  measurement  of 
the  individual. 

A.   TEST CONSTRUCTION

Test construction is a long and arduous process. A
decision must first be made concerning the purpose of the
test, i.e. what abilities, proficiencies, or aptitudes are
to be measured. In order to do this, the test maker must
have a knowledge of the requirements of the particular
functions which the testees will perform. He must then
analyze the component abilities, proficiencies, or aptitudes
which are necessary to perform the stated function. The
test maker then prepares a large number of questions, almost
always of the multiple choice variety for intelligence tests,
to be used in the initial stage. Then a weeding-out process
begins. The test maker may reject many of the questions and
reword others at this stage.

The surviving questions are then "pretested" on people
comparable to those for whom the test is intended, and
a statistical dossier is compiled for each question. If
a question is answered correctly mainly by the "better"
examinees it is a good question. If it is answered
correctly mainly by the "poorer" ones it is a bad
question. If a fair number of the "better" examinees
favor one answer and a comparable number favor another,
the question is probably ambiguous. If everyone gets
it right, it is useless. And so on.
In the light of pretest statistics, still further
questions are rejected or rewritten, and ultimately a
rigorously screened version of the test emerges. It is
now ready to be given to the people for whom it was
constructed. ... The test is given a preliminary try
out and the results receive elaborate statistical
analysis.15
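
The pretest screening described in the passage quoted above can
be illustrated with a short sketch. It is not taken from Hoffmann
or from any Navy procedure. The function name item_statistics,
the use of total score to separate the "better" from the "poorer"
examinees, and the default 27 percent cut-off for the two groups
are all assumptions made for illustration only.

```python
def item_statistics(responses, group_fraction=0.27):
    """Return (difficulty, discrimination) for each question.

    responses: one list per examinee holding 1 (right) or 0 (wrong)
    for every question. Hypothetical helper, for illustration only.
    """
    # Rank examinees by total score, best first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * group_fraction))
    better, poorer = ranked[:k], ranked[-k:]

    stats = []
    for q in range(len(responses[0])):
        # Difficulty: share of all examinees answering correctly.
        difficulty = sum(r[q] for r in responses) / len(responses)
        # Discrimination: share correct among the "better" group
        # minus the share correct among the "poorer" group.
        discrimination = (sum(r[q] for r in better)
                          - sum(r[q] for r in poorer)) / k
        stats.append((difficulty, discrimination))
    return stats


# Invented pretest results: six examinees, four questions.
pretest = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
]
for number, (p, d) in enumerate(item_statistics(pretest, 0.5), start=1):
    print(f"question {number}: difficulty={p:.2f} discrimination={d:+.2f}")
```

Under these assumptions question 1, which everyone answers
correctly, shows a discrimination of zero and is useless, while
question 3, answered correctly only by the poorer examinees,
shows a negative value and would be judged a bad question.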

At the time when original construction begins the test
maker decides what salient characteristics testees must
possess in order to perform the job for which the test is to
be given. These characteristics may vary in quantity, but
are usually small in number. Original questions are selected
to measure each of these characteristics and, hopefully,
through the above quoted procedure the finished test in its
smooth form will furnish the test user with sufficient
information to allow him to make a better personnel
decision than he could have made without the test.

In constructing the test, every possible aspect has
been standardized. Standardized time, room temperature, and
lighting are desirable. Timing is considered particularly
important inasmuch as this helps to weed out testees who are
not familiar with the subject matter of questions and are
slow in coming up with an answer. This also saves time on
the part of the tester and the testee.


15 Banesh Hoffmann, "The Tyranny of Multiple-Choice Tests,"
Harper's Magazine, CCXXII, No. 1330 (March, 1961), p. 38.

Another aspect of standardization in tests is the answers.
Questions of the multiple choice variety usually request that
the correct answer be chosen out of 3, 4, or 5 possible answers
or that the best answer be chosen by the testee. When the
test is pretested, standardized answers are selected by the
test maker on the basis of the most successful examinees'
answers. Answers thus obtained are, of course, subjected to
the most severe statistical analysis. Assurances can then
be given without reservation that the standardized answer for
each question is significant at a particular level. Since
we have predetermined answers for questions, test grading is
a very simple matter requiring no judgement. Where large
numbers of tests are involved, grading by machine is the
least expensive and most accurate method of determining test
scores. This is true of all "objective" type standardized
tests.

Objective type multiple choice tests are generally
thought to be of very high caliber inasmuch as the margin for
human error has been largely removed from well constructed
tests. This is a possible error in test construction.
Individuals who take tests can only answer questions
subjectively, i.e., within the framework of their own experiences,
achievements, and judgement. Mass testing with predetermined
answers accentuates previous experiences and achievements.
Little stress is placed on the judgement of the individual.
Justification for this emphasis on experience and achievement
is readily apparent. The individual with the desired level
of intelligence or judgement will have had experiences
similar to others in society. Therefore his current judgement
or reasoning ability is a direct result of his past
achievements and experiences. If he has no experience in an area
being tested, he will be scored low by the machine, because
he was not standard and didn't produce standard answers.

In some cases the testee is penalized for using
judgement. An example of a sentence completion item from the
Navy's General Classification Test will reveal this.

A good sailor will ________ the orders of his superior
officer.

(A) see
(B) fear
(C) question
(D) obey
(E) change

It is presumed that this question is no longer in use,
if it was ever used, but it is felt that it is representative
of many completion (choose the best answer) type questions.
This question or a similar question is administered to
recruits after a period of recruit training. During training,
conformity in thought and actions is a desirable behavior
pattern. Lectures laud the life of a good sailor and the
merits of obedience are extolled. Movies of the Navy's
great accomplishments are shown, and to each order issued by
a superior officer good sailors have resounded with an
"Aye, Aye, Sir," meaning that the good sailor has received
the order, he comprehends the order, and he will obey the
order. The key word then becomes "obey", but is this the
desired objective answer? Most sailors will undoubtedly choose
"obey" to complete this sentence. A few will choose answers
B, C, and E because they don't understand the question or
they have psychological incapacities. But the sailor who is
attempting to use judgement is at an impasse. He knows that
a good sailor must receive an order, comprehend it, and then
obey it. This sailor knows that the statement implies orders
have been issued. Further, he knows that to see or comprehend
an order is an absolute prerequisite to obeying. Should a
good sailor comprehend all orders given by superiors or
should a good sailor obey orders received without question,
whether he understands them or not? He may follow a logical
sequence and give "see" as his answer, or he may try to figure
out what answer the test maker wants and give "obey". In
either case he has been left far behind other testees and
may not finish the test in standard time.17

17 The objective answer to this question is unknown.

Other examples of objective type questions that might
be deemed confusing, vague, misleading, or ambiguous can be
found in many tests in use today. The California Survey of
Mental Maturity, Form 1, is considered by many authorities
to be a test of considerable merit. This is a multiple
choice objective type test divided into language and
non-language sections with several subsections in each section.
One subsection of the language section on page 5, left
column, states: "In each row, there is one picture that shows
something which is the opposite of the first picture. Mark
its number. (Items 23-27)"18 Question number 25 gives a
picture of falling rain in a wooded area as a first picture
and, as its possible opposites, there are pictures of (1)
an exploding stick of dynamite; (2) a geyser spewing into
the air; (3) a water fountain sprinkling water into the air;
and (4) a mountain stream. The correct objective answer to
this question is number 4, possibly because the test makers
thought that a mountain stream was not violent. A non-random
sample of seven testees of high intellect chose number 1 as
the correct answer, and all because the other three choices
contained moving water.

The language section of this test contains a question,
number 2, considered to be more defective than number 25. For
this subsection, instructions tell us to:19 "Mark the number
of the word that means the same or about the same as the
first word." The first word listed is oppress. The possible
choices are listed as: promise, imitate, crowd, and burden.
It was not deemed necessary to test this question. In this
case, a review of a current English dictionary by the test
makers would have given cause to remove the question from
the test. Both to crowd and to burden are listed as correct
meanings for oppress.20 However, the objective answer to
this question is burden.

18 W. W. Clark, et al., Survey of Mental Maturity, Form 1
(Los Angeles: California Test Bureau, 1959), p. 5.

For an excellent analysis of multiple choice questions
with objective answers, readers are invited to consult
The Tyranny of Testing by Banesh Hoffmann. Dr. Hoffmann has
made a comprehensive study of testing and estimates that as
many as 5 percent of the questions used in our best tests are
defective. He has taken an analytical approach in his
study that may help to improve testing.

B.   RELIABILITY  OF  TESTS 

It is often stated by test makers that a test cannot be
valid unless it is reliable. Reliability, quite simply,
refers to consistency of results. "In theory if an individual
were to take a test three or four times he would answer each
question the same way and would come up with the same score."21
This is ideal reliability and unfortunately seldom happens.
In fact, scores usually improve each time a test is taken
and may improve considerably if the individual has taken other
tests of the same type, even on subject material unrelated
to that of the original test. Repeated administration of the
same test is the test-retest method of measuring reliability,
and it is not often used, for reasons which are obvious from
the above discussion. However, memory traces which cause
improvement each time a test is taken tend to fade with time,
and better reliability is found in using the test-retest
method when a time span is allowed between tests. Since
individuals are continually learning, however, the time span
allows a person to acquire new knowledge which interferes with
our reliability test. For all of these reasons, the
test-retest method is less than satisfactory.

20 C. T. Onions (ed.), The Oxford Universal Dictionary
(New York: Rand McNally & Company, 1955), p. 1377.

21 Rossall J. Johnson, Personnel and Industrial Relations
(Homewood, Ill.: Richard D. Irwin, Inc., 1960), pp. 50-51.

Two other methods of testing for reliability commonly
in use are the equivalent form and split halves methods. In
the equivalent form method, two tests of equal difficulty are
developed covering the same subject. If these tests are
identical, scores on the tests are identical. Since no two
questions are ever identical, reliability must be estimated,
but fair estimates can be obtained in this manner. The split
halves method is a variation of the equivalent form method. In
the equivalent form method two tests are developed, while in
the split halves method a single test is split into two parts
of equal difficulty, the halves are measured against each
other, and then by statistical massage test reliability can
be determined. The Navy uses both of these methods for
developing test reliability data, using each method singly or,
in some cases, combining the two methods.
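
The split halves computation might be sketched as follows. The
text above does not say how the halves are formed or what
statistical adjustment is applied. The odd-even split and the
Spearman-Brown step-up formula used below are common textbook
choices assumed here for illustration, and the function names
are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """responses: one list of 0/1 item scores per examinee."""
    # Assumed split: odd-numbered items against even-numbered items.
    first = [sum(r[0::2]) for r in responses]
    second = [sum(r[1::2]) for r in responses]
    return spearman_brown(pearson_r(first, second))


# Invented answers: five examinees, six questions.
answers = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
]
print(round(split_half_reliability(answers), 2))
```

The step-up is needed because each half is only half as long as
the full test, and a shorter test is ordinarily less consistent
than a longer one made up of comparable questions.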

C.   VALIDITY  OF  TESTS 

Previously it has been noted that the test maker must
determine what component abilities, proficiencies, and
aptitudes must be possessed to perform a given function.
Tests are considered valid if they measure these components
accurately. However, in addition, before tests are
considered valid, it must be proven that the original analysis
is correct, i.e., a test which measures verbal ability may
have high validity in measuring verbal ability, but the same
test may have very low or zero validity when correlated
with job success.

The ideal is seldom found and tests are considered
beneficial if a positive correlation exists. The greater the
coefficient of correlation, the better the test. Frequently
very low correlations are sufficient to weed out personnel
who are obviously not qualified to perform a given function.

Determining test validity is an extremely difficult
task. First a job analysis is necessary, then the criteria
for success must be established. Once the criteria have been
established, a grading or scale for assessing job success for
each individual is necessary. Only then can the validity of
a test be found by matching job success with test scores.

CHAPTER  IV
NECESSITY  FOR  TEST  IMPROVEMENT

Testing as a means of mental measurement can be an
excellent tool in our military arsenal. It is not sufficient
to maintain this tool in a static state when conditions and
needs are dynamic. The present state of testing can be
likened to a ship which was built in 1944 and has been kept
in an excellent state of repair. In many respects this ship
can still fill a vital role in the Navy's mission, just as
testing assists in classifying, training, and placing personnel.
Improvements have been made in old weapon systems and new
ones, through research, have been developed. Old tests have
been improved; at least statistics tell us that test
reliability and validity are improving with each test revision.

A.   CONSTRUCTION

Test construction has previously been discussed in broad
outline. Several defects have been pointed out and other
defects implied. These defects in total, if Dr. Hoffmann's
estimate can be accepted, would allow the less intelligent
individual with superficial knowledge to obtain a raw score
five percent higher than his more intelligent contemporary,
although the probability of an extreme of this sort is
quite low and waivers can usually be obtained if an applicant
for any program has persistence. However, the applicant's
persistence must not waver while he is convincing his
Division Petty Officer, Division Officer, Department Head,
and Commanding Officer of his sincerity.

Questions which are constructed using flawless grammar
with the best answer requested from a choice of several
alternatives are at best suspect when more than one correct
choice may be interpreted. There is considerable certainty
that we will obtain answers to questions of this type which
show us a normal distribution, i.e., after the question has
completed the cyclical test for reliability. This distribution
is obtained through careful study of answers given, and answers
certainly reflect the experience, education, and achievements,
or lack of these factors, of persons answering the questions.
The author has been unable to locate any relevant studies
which attempt an analysis as to why distracter answers are
chosen by testees taking multiple-choice objective type
tests or, for that matter, why the objective answer is chosen
by testees. It is felt that this information is an
absolute necessity before questions of this type can be
clearly evaluated and used as a measuring device.

There is, of course, no excuse for constructing questions
which tend to mislead. This is a favorite method of many
college professors who test for rote memorization. This type
of question is not only incorrect, it is a discredit to the
intellect of man. In the category of questions which tend
to mislead, we must include all questions which are vague,
ambiguous, and in general tasteless. Questions of the
misleading variety may serve a valid purpose when used by
experts in individual testing, but their usefulness is
marginal when in group testing we attempt to measure the
intellect of a particular individual, although tests of this
type are beneficial if it is desired to measure one's
ability to detect flaws in construction.

There are very few questions of the types described
above in use by military testers, but any is too many. These
questions are a twofold detriment to sound testing because
the testee must first determine among many variables what
the question is requesting and then select an answer from
several possible objective answers. The total possibilities
in a poorly constructed question can be astronomical in
number. Perhaps probabilities could be assigned to each
possibility, but this would bring the testee no closer to
comprehending the question than before and his answer would,
largely, still be left to chance.

B.   VALIDITY

It is presumed in this section that military tests are
well constructed. They are highly reliable and reliability
tests show a correlation coefficient of .80 or greater.

The concept of validity is crucial to any testing program.
If a test is perfectly valid, it has a correlation
coefficient of plus 1, i.e., the level of performance of
each worker is identical to his test score in relation to
the group being tested. Perfect validation is illustrated
in Appendix B. At the other extreme, a test may have a
perfectly negative correlation coefficient of minus 1, where
the individuals who obtain the lowest scores are the best
workers. This is also illustrated in Appendix B. In the
event there is no relationship between test scores and work
performance, a zero correlation coefficient, also illustrated
in Appendix B, is said to exist. Tests with a zero correlation
coefficient are considered to have little merit, while those
having a positive or negative correlation can be used.
However, in choosing workers by using a test having a negative
correlation with job success, it must be remembered that
low scores mean that the worker will be a success on the job
for which the test was developed.

The Navy's studies of test validity have been quite
extensive within a limited range. Available studies indicate
that the area of coverage has been limited wholly to
academic performance. This has been necessary because the
Navy has not yet developed an adequate system of rating
officers and men in job performance outside the training
area. Some controversy erupts from time to time between
proponents of various rating methods, but no system
has been officially adopted which even purports to solve
this conundrum. Therefore, since all military personnel
receive some training, the criterion for success hinges on
academic performance in training assignments.

Stuit's22 studies of test validity show a positive
correlation between scholastic achievement and test scores
for most Navy tests used in classifying both officers and
enlisted personnel in World War II. He succinctly points
out instances of negative correlation, but these are small
in number and can be disregarded. For the most part
correlation coefficients fell in the range .10 to .70. Any
coefficient above .60 is considered very high.

A more recent study of test validity revealed that there
was a significant positive relationship between the Basic
Test Battery (BTB) for enlisted personnel and final grades
attained at class A and class P Navy schools.23 Various
combinations of the Basic Test Battery scores were used in
this study. These same test score combinations had previously
been used in assigning personnel to school.

22 Stuit, op. cit., et passim.

23 Research Report £Z^L, NAVPERS 1S344A, Revised Edition,
Personnel Measurement Research Branch, Personnel Analysis
Division, Bureau of Naval Personnel, April 1957, et passim.

In general, service schools in the Navy are under the
control of the Bureau of Naval Personnel. Service schools
are established as satellite commands in a larger complex,
as independent commands, and as school commands where schools
of several types are established under one commanding officer.
Training programs are established by the Bureau of Naval
Personnel in conjunction with a technical bureau having
primary responsibility in the area concerned. The Navy's
need for training personnel is determined by the Bureau of
Naval Personnel, again in conjunction with the technical
bureau concerned. Quotas are established and personnel are
selected and assigned to the various established schools.
These assignments are based upon service needs, test scores,
and individual preference. Training commands, at this
point, have an approved training program and trainable
students, and these commands are expected to train and
graduate men who are capable of performing technical service
in today's Navy of ever increasing complexity.

There are indications that service schools labor under
some handicaps in fulfilling their missions. Standards must
be set as a goal for students. At the same time personnel
requirements must be considered, so standards must not be
so high as to prevent the required number from completing
training. Standards among schools training personnel for
the same technical specialty do not vary, but commanding
officers who apply these standards in too rigorous a manner
may be subject to severe criticism. In situations where a
school is training personnel who are not meeting standard,
grading may have to be revised according to the study quoted
below.

Validity studies made under circumstances where true
performance is unknown are tenuous. In addition, school
operating personnel are placed in a rather difficult
position in meeting requirements of quality and quantity. A
paragraph from the study noted above does little to instill
confidence in the Navy's studies of test validity. This
paragraph is quoted as follows:24

(Usually the validity coefficients presented for
two class "A" schools training men for the same
ratings are of comparable magnitude. However, in
a few cases there are wide disparities. In these
cases, for the schools with the much lower validities,
the grading system might well be reviewed, since
criterion unreliability is one of the factors which
often reduce the obtained validities of aptitude tests.)

We must, of course, agree that criterion reliability
is an absolute necessity if we are to obtain reasonably
correct validity coefficients, but the line of action proposed
here would only increase the validity coefficient and may not
correct it at all. In a military complex, a review has many
connotations, and to single out a school and suggest that its
grading system be revised because validity studies do not
compare favorably is tantamount to censure. If some
disparity does exist, an examination is certainly indicated,
but in checking for criterion reliability, we should be a
bit more scientific and review the grading systems of all
schools having the same mission.
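
The effect of criterion unreliability on an obtained validity
coefficient can be sketched with the classical attenuation
relation of test theory, in which the obtained coefficient
equals the underlying coefficient multiplied by the square root
of the product of the test reliability and the criterion
reliability. This relation is not given in the study quoted
above, the function name is hypothetical, and the figures are
invented. The sketch only illustrates why more consistent
grading raises the obtained coefficient without showing whether
the grades reflect true performance.

```python
def obtained_validity(true_validity, test_reliability, criterion_reliability):
    """Classical attenuation: unreliable measures shrink the correlation."""
    return true_validity * (test_reliability * criterion_reliability) ** 0.5


# Invented figures: the same test and the same underlying relationship,
# graded at two schools whose grading differs only in consistency.
for school, grade_reliability in (("School X", 0.90), ("School Y", 0.40)):
    r = obtained_validity(0.60, 0.90, grade_reliability)
    print(f"{school}: obtained validity = {r:.2f}")
```

With these invented figures the two schools show obtained
validities of 0.54 and 0.36 although the test and the men are
assumed to be alike, which is the kind of disparity the quoted
paragraph attributes to the grading system.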

A study by Thorndike and Hagen of more than ten thousand
men who had previously taken military test batteries was
completed and published in 1959. Several limitations were
recognized by the authors of this study in reaching their
conclusions on the validity of aptitude tests as a predictor
of job success in civilian occupations. All men studied were
gainfully employed in various jobs of their own choice. It
is stated:25

"In general conclusion, we must say that though it is
possible that tests of aptitude can show validity in
long-range predictions of occupational success when
individuals are employed in jobs in widely different
parts of the country, our data give little evidence to
encourage this belief."

Conclusions and results are succinctly stated as follows:26

25 R. L. Thorndike and E. Hagen, Ten Thousand Careers
(New York: John Wiley and Sons, Inc., 1959), p. /J

26 Ibid., p. 50.

Our results showed that occupational groups
differed with respect to personal background variables
as well as with respect to aptitude test scores. It
is hard to make a quantitative comparison between
these two types of information, but our judgement would
be that items of personal background differentiated
about as sharply as did scores on aptitude tests.
Once again, the patterns were, in most instances,
sensible and in accord with what we would have expected
by a priori analysis of the occupations. It is possible
to rationalize most of the significant differences with
some satisfaction. There were, of course, some
differences that are difficult to rationalize, but
these can, in many instances, be thought of as chance
variations and ones that probably would not hold up
in another sample.

With respect to prediction of success within an
occupation, our conclusions must be quite different.
As far as we were able to determine from our data, there
is no convincing evidence that aptitude tests or
biographical information of the type that was available
to us can predict degree of success within an
occupation insofar as this is represented in the
criterion measures that we were able to obtain. This
would suggest that we should view the long-range
prediction of occupational success by aptitude tests
with a good deal of skepticism and take a very restrained
view as to how much can be accomplished in this direction.
It is possible that data for a more heterogeneous group
of applicants would lead to different conclusions in
this respect; however, our suspicion is that if the group
had been more heterogeneous, our increased success would
have shown up primarily in an increased sharpness of
differentiation among occupations rather than in improved
ability to predict within a single occupation. Certainly,
if we had taken the whole range of abilities in the
American population, the profile patterns would have
become very much more clear-cut and the differences
among occupations would have become a good deal more
striking. Whether at the same time we would have developed
some success at predicting degrees of achievement within
an occupation seems very much open to question.

The group involved in this study was limited to
former Army Air Force Cadets. Tests used were of the general
type previously described as being administered to officer
personnel for the purpose of classification. It is felt
that the results, as outlined by Thorndike and Hagen, speak
for themselves and the subject requires no further comment
at this point.

CHAPTER  V
TOOLS  FOR  THE  FUTURE

The United States Navy and other military services
have throughout the history of the United States served
their country well in both war and peace. However, never
before have the military services been called upon to
prepare for instantaneous defense of their nation on a
global scale. This calling has necessitated a peace-time
build-up of men and materials beyond the comprehension of
our civilian and military leaders in World War II.

In an effort to minimize the cost of the defense effort,
thus lessening the military drain on the National Economy,
civilian and military leaders have concentrated their efforts
on the spectacular, i.e., areas of high dollar cost. Efforts
in these areas have certainly given us more bang for the
buck. The art of Operational Analysis has been introduced
and promises to be extremely useful in lowering costs and
increasing efficiency. All new, as well as old, projects
are scrutinized to determine if they permit optimum use;
reduce costs; have sufficiently low costs; increase the
speed of; are capable of; promote and conserve; are
compatible with; maximize output; and a myriad of other
catch phrases meaning the same thing: get the most for the
military dollar.

Billions of dollars have been allocated and spent for
research and development of weapons and systems which
are deemed necessary for national defense. Many more billions
have been spent maintaining and operating these weapons
and systems.

In FY 1962, the Navy spent approximately 2.7 billion
dollars on military manpower. A very small fraction of this
amount was allocated to personnel utilization research.27
We have definitely increased our repertory of tools necessary
for the future, but, to a large extent, the tools necessary
for the proper utilization of manpower have yet to be
fabricated.

A.   RESEARCH REQUIRED

Much  research  has  already  been  accomplished,  but  our 
knowledge  of  man  is  extremely  limited.   The  general  educational 
level  of  a  person  can  be  obtained  by  a  simple  pencil  and  paper 
test,  but  our  knowledge  of  individual  capacities  must  be 
increased and put to use. For example, aptitude is defined
as:28 "a condition or set of characteristics regarded as


27Dollar costs for this program were not available in
the Office of the Navy Comptroller or in the Bureau of Naval
Personnel.   It  is  presumed  that  information  of  this  type 
would  be  extremely  difficult  to  obtain  with  the  accounting 
system  currently  in  use. 

28E. L. Hartley and R. E. Hartley, Outside Readings in
Psychology (New York: Thomas Y. Crowell Company, 1957),
p. 274.

symptomatic of an individual's ability to acquire with
training some (usually specified) knowledge, skill, or set
of responses, such as ability to speak a language, to produce
music . . ."

This is a broad definition and probably fairly accurate
because it is a wide-angle approach to aptitude. It is to
be noted that knowledge in a specified area is not a necessary
prerequisite to being trained in that area. According to
many  learning  theorists,  learning  is  accomplished  most 
rapidly  when  there  is  no  interference  from  already  acquired 
knowledge. 

The Navy's test for mechanical aptitude serves to
illustrate that there may be little relationship between
previously acquired mechanical experience and an aptitude for
learning mechanical skills. Validity studies for this test
normally reveal low correlation coefficients because we do
not know what characteristics or abilities are required to
learn a mechanical skill. According to the study by Thorndike
and Hagen, previously quoted, backgrounds differentiated
between occupations as sharply as did aptitude test scores.
This then appears to be an area that requires considerable
basic and applied research.

It is not felt that research of the type alluded to in
the previous paragraph should be performed within the military
establishment inasmuch as personnel who are in ratings,
specialities  or  occupations  at  present  are  probably  not 
representative  of  a  population  which  seeks  its  own  level  in 
society.  Most  military  enlisted  billets  are  filled  by 
personnel  who  were  considered  trainable  in  a  particular 
speciality  at  an  early  stage  of  their  military  service  by 
virtue  of  their  test  scores.   Studies  of  this  group  have 
vindicated  past  procedures  and  will  certainly  do  so  in  the 
future,  but  will  furnish  little  usable  data.  Many  military 
specialities  are,  of  course,  not  found  in  use  in  the 
civilian  economy  nor  will  a  military  environment  be  frequently 
found,  but  these  superficial  handicaps  will  for  practical 
purposes  disappear  when  they  are  carefully  examined. 

B.   SELECTION  FOR  TRAINING 

Chapter II briefly outlined the manner in which tests
are currently used for selecting enlisted personnel for
training. At that point it was noted that the Officer
Classification Battery (OCB) was not enjoying wide use as a
selection device. It was further shown in Chapter IV that
the OCB, in a study by Thorndike and Hagen, may have little
validity as a predictor of what occupation will be chosen
by the individual and less validity as a predictor of
success.

The Superintendent, United States Naval Postgraduate
School,29 by inference, agrees that the OCB is an instrument
of limited usefulness. In a letter, Ser 2166 dated 2 Aug 1963
to the Chief of Naval Personnel, the Superintendent set forth
his recommended guidelines for the Postgraduate Selection
Board's use in selecting students for postgraduate study
during academic year 1964-1965. These recommendations were
straightforward and pertinent, but there was no mention of
the Officer Classification Battery.

Due to the diverse backgrounds of the several thousand
officers considered for postgraduate study, some common
attribute that could be used as a predictor of academic success
was needed. This was essentially resolved by considering the
officer's background as reflected in his personnel record on
file in the Bureau of Naval Personnel. Each officer had on
file fitness reports from which the Selection Board could
determine the level of his past performance, for the most
part, in non-academic assignments. The Selection Board also
had available academic transcripts of undergraduate education
from several hundred colleges and universities. The criterion
for assigning grades in many of these schools was unknown.
The direct cost of selection by this method is not insignificant
and the opportunity costs can be appalling.

29This is the largest institution of its type in the world
and its primary mission is the postgraduate education of Naval
Officers.

Economics of Testing

Tests used by the military services as entrance
screening devices in peace-time save the tax-payers from an
unnecessary burden in two ways. First, monies are not
wasted in attempting to train personnel who do not have the
requisite capacities for military service, and second, the
total efficiency of the military organization is increased by
eliminating the possibility of non-trainable personnel acting
as a drag in an otherwise smooth-running organization.
Entrance standards have been low in the past and perhaps
will be lower in the future if the military services are
required to enlist and train the masses of unemployable.
However, the military services, with the exception of the Army,
have been able to screen out most of the untrainables prior
to enlistment.

This  paper  is  principally  concerned  with  what  happens 
after  enlistment  or  commissioning  since  costs  prior  to  this 
time are insignificant as far as tests are concerned. Pay
and allowances, with variations for promotions, transfers, etc.,
are  relatively  fixed  and  can  be  roughly  considered  as  sunk 
costs  for  the  duration  of  an  enlistment  or  tour  of  active 
duty. 

The  extent  of  testing  officer  and  enlisted  personnel 
on  active  duty  essentially  depends  upon  the  time  available 
for testing and the costs involved in testing. However, the
economic benefits to be derived hinge directly upon the
validity of the tests. If test validity is zero or positive
with a low degree of confidence, the use of tests is not
economically feasible because of the expense involved in
testing.

The current procedure for selecting Naval Officers for
Postgraduate education is an example of the inadequacies and
diseconomies of tests as mental measurement devices. It is
evidently felt by military authorities that the use of the
OCB as a decision making tool would give rise to more
wrong decisions than correct decisions. If a test does this,
it is an economic burden. Several studies have been done by
Naval Management students on the validity of the mathematical
and verbal portions of the OCB as a predictor of success
in the Management curriculum. Appendix C illustrates in
plotted form the results of one such study. It can easily
be seen that the validity is near zero. These tests may be
valid as a predictor of success in other areas, but they
cannot be justified economically as a tool for selecting
management students.
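
A validity coefficient of the kind discussed above is conventionally a product-moment correlation between test scores and a criterion such as course grades. The following is only a minimal sketch of that calculation. The function name and the four pairs of scores and grades are hypothetical, invented for illustration, and are not taken from Appendix C or from any study cited here.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical test scores and course grades, invented for illustration only.
test_scores = [40, 50, 60, 70]
grades = [3.0, 2.0, 2.0, 3.0]

r = pearson_r(test_scores, grades)
print(f"validity coefficient r = {r:.2f}")
```

A coefficient near zero, as with these invented figures, means that the test scores carry essentially no information about the criterion, which is the situation the text describes for the Management curriculum.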

Ideally, if we have one thousand new inductees and one
thousand billets to fill, tests of intelligence, aptitude and
abilities, with perfect validity, would allow these officers
and men to be placed in assignments consistent with their
qualifications. This in turn would raise efficiency, i.e.,
output per man, and billets could be deleted in direct
proportion to increased efficiency inasmuch as only a given
level of output is required or can be economically tolerated
for defense. A testing program of this magnitude is
difficult to comprehend and perhaps not realistic when costs
are considered, but if, through testing and the proper
placement of personnel, we could achieve a one percent increase in
military manpower efficiency in FY 1964, a reduction in total
manpower requirements would save more than $120,000,000 while
maintaining the same output.
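
The arithmetic behind this estimate can be sketched as follows. It assumes, as the paragraph does, that billets are deleted in direct proportion to the efficiency gain. The function names are hypothetical. The 12 billion dollar total is not stated in the text; it is only what the quoted one percent and $120,000,000 figures imply. The Navy line simply applies the same one percent to the FY 1962 figure of 2.7 billion dollars cited earlier.

```python
def savings_from_efficiency_gain(manpower_cost, efficiency_gain):
    """Dollar savings if billets are cut in direct proportion to the gain."""
    return manpower_cost * efficiency_gain


def implied_manpower_cost(savings, efficiency_gain):
    """Total manpower cost implied by a quoted saving and efficiency gain."""
    return savings / efficiency_gain


GAIN = 0.01                # one percent increase in efficiency (from the text)
QUOTED_SAVINGS = 120e6     # "more than $120,000,000" (from the text)
NAVY_FY1962_COST = 2.7e9   # Navy military manpower, FY 1962 (from the text)

# Implied figure only; the text does not state a total manpower cost.
implied_total = implied_manpower_cost(QUOTED_SAVINGS, GAIN)
navy_savings = savings_from_efficiency_gain(NAVY_FY1962_COST, GAIN)

print(f"implied total manpower cost: ${implied_total:,.0f}")
print(f"one percent of Navy FY 1962 manpower cost: ${navy_savings:,.0f}")
```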

The  military  services  are  constantly  striving  to  increase 
the  effectiveness  of  their  weapons  at  the  lowest  possible 
cost.  Historically,  manpower  has  been  the  most  effective 
weapon  possessed  by  any  nation  involved  in  conflict.  Manpower 
must  be  considered  as  our  most  effective  weapon  in  any  future 
conflicts, but wars cannot be won in the modern age if we
use our resources in a haphazard manner. Testing assists in
the proper utilization of human resources and can become a
more valuable tool in the future.

Any tool, such as testing, can be misused and result in
diseconomies which are reflected in exorbitant opportunity
costs. These costs arise in several ways, but the basic
misuses occur from treating test results as an absolute
indicator when in reality they should be considered as a
sample which does not reflect drive, motivation, interests,
or even aptitudes clearly. Another misuse which clearly
results in opportunity cost is to ignore test results when
they should be used.

Testing  as  a  tool  for  the  future  presupposes  that  the 
military  services  will  train  personnel  of  the  highest 
calibre  in  Personnel  Management  in  order  that  decisions 
involving  personnel  classifications,  placement,  training,  and 
assignment  will  be  made  which  reflect  service  needs, 
personal  needs  on  the  part  of  members,  and  economies  in 
management.

CHAPTER VI
CONCLUSIONS AND RECOMMENDATIONS

A.   CONCLUSIONS

The military services are using paper and pencil tests
to measure intelligence, aptitudes, and achievements. These
tests are contributing to the efficient utilization of military
manpower. Each of the military services, in its testing
program, presupposes unique personnel requirements. This is
difficult to fathom except for isolated occupations.

Testing, for the purpose of mental measurement, has not
reached its maturity and much basic research is required.
In fact, testing for military use is, at best, in its
infancy. Efforts to expand the frontiers of knowledge have
been tenuous and narrow in military testing. Improvements
have been made in the (BTB) for Naval enlisted personnel, but
validity studies indicate a need for better instruments.

Testing in its present state is a sampling device
which cannot be used effectively without considering
background factors, drive, and motivation in the assignment of
personnel. The consideration of background factors, drive,
and motivation has not been significant in the placement of
Naval enlisted personnel, while these have been the only
factors considered when Naval officer personnel are selected
for advanced or postgraduate training.

Many defects in our current testing program exist
because we originally made a cursory examination of the
occupations for which the tests were created. The expediency
required by war does not justify this superficial approach
to job analysis during peacetime.

Test validity studies justify the cost of testing Naval
enlisted personnel if the studies themselves can be accepted
as valid. Little is known, however, about the relationship
between job performance and test scores, although much worthwhile
information has been gained from studies of the relationship
between non-performance and test scores.

Lastly,  it  is  concluded  that  testing  in  the  military 
services  is  a  necessary  and  important  part  of  Military 
Personnel Management.

B.   RECOMMENDATIONS

The following recommendations are made subject to
revision as new and/or more reliable data becomes available.

1. The present military testing program should be
continued. However, those tests not deemed sufficiently
valid to be used should be discontinued immediately.

2. Studies should be initiated by the Department of
Defense to determine if the four military services do
have unique personnel requirements.

3. An office should be established at the Department of
Defense level to coordinate and evaluate an intensive
research program. This program should be directed
toward occupational selection by individuals with an
advanced goal of success prediction.

4. A job analysis for each billet should be commenced
by the military services and coordinated by the
Department of Defense.

5. Training programs should be initiated by each military
service to train all personnel who make personnel
decisions in the uses and limitations of test scores.

6. Classification centers of each of the military
services should be staffed by personnel thoroughly
trained in eliciting background information from
individuals being interviewed as well as ascertaining
their motivations, drives, and ambitions. Centers
should be staffed with sufficient numbers of such
personnel to allow a minimum of one hour for each
interview. Personnel being classified should be given
a definite or a conditional classification. Personnel
who have been given a conditional classification should
be interviewed again, at a classification center, at
the end of one year and given a definite classification.

7. Each of the military services must develop a
performance rating system that will enable reviewing
authorities to evaluate performance by the degree of
job success.

8. The Department of Defense should request in the
next military budget monies for manpower utilization
research.

9. Tests in current use should be validated as soon as
job analyses are complete and job performance
evaluations are available.

10. The Department of Defense should plan and coordinate
the entire program as previously outlined in brief and
a standardized military testing program should be
developed at this level as soon as economically feasible.

APPENDIX A


NORMAL  CURVE 

[Figure: normal curve. Recoverable labels: area under curve
(13.59%, 2.14%, .13%); standard deviations (+2σ, +3σ);
cumulative percent** (97.7%, 99.9%); Navy Standard Scores
(T-Scores)* 20, 30, 40, 50, 60, 70, 80.]

*  The range of Navy Standard Scores theoretically extends
from zero to one hundred, but the probability of a score
falling below 20, or above 80, is only .13% in each case,
and scores in the extreme ranges are not included.

** Percentage figures indicate what part of the population falls
below a given point.
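
The relationship between Navy Standard Scores and cumulative percent can be sketched as below. The sketch assumes the usual T-score convention of a mean of 50 and a standard deviation of 10, which is how the figure's axis reads (20 to 80 spanning three standard deviations either side of the mean). The constant and function names are hypothetical.

```python
import math

MEAN_T = 50.0  # assumed mean of Navy Standard Scores (T-score convention)
SD_T = 10.0    # assumed standard deviation (T-score convention)


def z_from_t(t_score):
    """Number of standard deviations a standard score lies from the mean."""
    return (t_score - MEAN_T) / SD_T


def cumulative_percent(t_score):
    """Percent of a normal population falling below the given score."""
    z = z_from_t(t_score)
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for t in (20, 30, 40, 50, 60, 70, 80):
    print(f"T = {t}: z = {z_from_t(t):+.0f}, "
          f"cumulative percent = {cumulative_percent(t):.2f}")
```

Under these assumptions the values at 70 and 80 round to the 97.7% and 99.9% shown on the figure, and the share below 20 rounds to the .13% given in the footnote.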

APPENDIX C
1962 Navy Management School Class